Abstract. This is an introduction to some of the most probabilistic aspects of free
probability theory.



Introduction 

Free probability is a non-commutative probability theory, in which the concept
of independence of classical probability is replaced by that of freeness. This new
concept incorporates both the probabilistic idea of non-correlation involved in the
independence of random variables, and the algebraic notion of absence of relations,
as e.g. between the generators of a (non-abelian) free group. It was introduced
abstractly by D. Voiculescu around 1983, in order to study some problems in the
theory of von Neumann algebras, but some years later, around 1990, he realized
that a concrete probabilistic model of free random variables is afforded by large
independent random matrices. This discovery has led to an impressive series
of advances in von Neumann algebras, some long-standing open problems being
solved with these new ideas. We shall not discuss these applications here; rather,
the purpose of these lectures is to explain to an audience of probabilists why
the theory of free random variables is an interesting and beautiful subject in itself,
with deep connections with classical probability and other related areas, such as
harmonic analysis and combinatorics. We hope that this brief introduction will
help readers find their way into the literature on free probability. We shall take
as a departure point a natural problem about hermitian matrices, whose solution
involves free probability theory.

2. Large matrices 

A frequent question, occurring both in mathematics and physics, is to determine
the spectrum of the sum of two hermitian matrices. Knowing the respective spectra
of two $N \times N$ hermitian matrices $A$ and $B$, the set of possible spectra for $A + B$ is a
subset of $\mathbb{R}^N$, which can be described explicitly and depends in a complicated way
on the spectra of $A$ and $B$. This problem is not an easy one, and indeed its solution
was obtained only quite recently (see e.g. [F] for a discussion). When $N$ becomes
large, however, a remarkable phenomenon occurs, and it turns out that, roughly
speaking, for almost all choices of the matrices $A$ and $B$, with given spectra, the
spectrum of $A + B$ is essentially the same, and can be computed explicitly, without
knowing the detailed structure of the matrices $A$ and $B$ (i.e. their eigenvectors).
In order to give a rigorous mathematical statement corresponding to the above
assertions, we shall first introduce a probability measure on the set of matrices
having the same spectrum as $A$. By the spectral theorem this set consists of all
matrices $UAU^*$, where $U$ ranges over the group of unitary $N \times N$ matrices. It is the
orbit $O_A$ of $A$ under the adjoint action of $U(N)$, and as such it carries a unique
invariant probability measure $\mu_A$, the image of the Haar measure on $U(N)$ by the
map $U \mapsto UAU^*$.

Given a hermitian matrix $A$, with spectrum $\lambda_1, \ldots, \lambda_N$ (counted with
multiplicities), we denote by $\nu_A$ the empirical distribution on the set of its eigenvalues,
i.e. the probability measure

$$\nu_A = \frac{1}{N} \sum_{j=1}^{N} \delta_{\lambda_j}.$$

Knowledge of the spectrum (with multiplicities) and of $\nu_A$ are equivalent.
We can now state the result we had in mind. 

Theorem 1. For each positive integer $N$, let $A_N$ and $B_N$ be two hermitian
matrices, whose norm is bounded uniformly in $N$, and let $\nu_1$ and $\nu_2$ be two probability
measures with compact supports on $\mathbb{R}$, such that $\nu_{A_N} \to \nu_1$ and $\nu_{B_N} \to \nu_2$ weakly
as $N \to \infty$. Then there exists a probability measure, depending only on $\nu_1$ and $\nu_2$,
denoted $\nu_1 \boxplus \nu_2$, such that $\nu_{A'_N + B'_N} \to \nu_1 \boxplus \nu_2$ weakly, in probability, as $N \to \infty$,
where $A'_N$ and $B'_N$ are random matrices chosen independently, with respective
distributions $\mu_{A_N}$ and $\mu_{B_N}$.

Thus, if we are given two $N \times N$ hermitian matrices (with $N$ large), and we
only know their spectra, then we can bet, with a good chance to win, that the
measure $\nu_{A+B}$ is well approximated by the measure $\nu_A \boxplus \nu_B$. In other words,
loosely speaking, we can compute the spectrum of $A + B$, knowing only the spectra
of $A$ and $B$!

Since any probability measure with compact support can be approximated by
measures like $\nu_A$ for arbitrarily large matrices, Theorem 1 introduces a binary
operation $\boxplus$ on the set of compactly supported measures on the real line which, for
reasons to be explained below, we shall call the free convolution of measures. This
binary operation is clearly associative and commutative.

It is instructive to compare the preceding result with the case where all considered
matrices are supposed to be diagonal. Let $A$ be a self-adjoint diagonal matrix; then
the set of all diagonal matrices having the same spectrum as $A$, the analogue of the
set $O_A$, is an orbit of the symmetric group $S_N$, acting by permutation of diagonal
entries. Let us denote the normalized counting measure on this set by $\xi_A$; then one
has, denoting by $\mu * \nu$ the usual convolution of two probability measures on $\mathbb{R}$,

Theorem 2. For each $N$, let $A_N$ and $B_N$ be two real diagonal matrices, and let
$\nu_1$ and $\nu_2$ be two probability measures with compact supports, such that $\nu_{A_N} \to \nu_1$
and $\nu_{B_N} \to \nu_2$ weakly as $N \to \infty$. Then $\nu_{A'_N + B'_N} \to \nu_1 * \nu_2$ weakly, in probability,
as $N \to \infty$, where $A'_N$ and $B'_N$ are random matrices chosen independently, with
respective distributions $\xi_{A_N}$ and $\xi_{B_N}$.

Although we do not know an adequate reference, this last result is probably well 
known. In fact, it is not difficult to deduce it from known concentration of measure 
results on the symmetric group, as e.g. in [M]. 
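
The statement of Theorem 2 can be explored numerically. The following is a minimal sketch, not part of the original text, assuming two hypothetical diagonal spectra (half 0 and half 1 for one matrix, half -1 and half +1 for the other). It permutes the diagonal entries uniformly at random and compares the first moments of the resulting spectrum with those of the classical convolution. All function names are illustrative.

```python
import random

def moments(values, max_order):
    """Moments of order 1..max_order of the uniform measure on `values`."""
    n = len(values)
    return [sum(v ** k for v in values) / n for k in range(1, max_order + 1)]

def convolution_moments(atoms_a, atoms_b, max_order):
    """Moments of the convolution of two uniform measures on finite sets of atoms."""
    return moments([x + y for x in atoms_a for y in atoms_b], max_order)

def permuted_sum_moments(diag_a, diag_b, max_order, rng):
    """Moments of the spectrum of A' + B', with independently permuted diagonals."""
    a, b = diag_a[:], diag_b[:]
    rng.shuffle(a)  # uniform random element of the orbit of A under S_N
    rng.shuffle(b)  # independent uniform random element of the orbit of B
    return moments([u + v for u, v in zip(a, b)], max_order)

rng = random.Random(0)
N = 20000
diag_a = [0.0] * (N // 2) + [1.0] * (N // 2)    # spectrum distribution (delta_0 + delta_1)/2
diag_b = [-1.0] * (N // 2) + [1.0] * (N // 2)   # spectrum distribution (delta_{-1} + delta_1)/2

empirical = permuted_sum_moments(diag_a, diag_b, 4, rng)
limit = convolution_moments([0.0, 1.0], [-1.0, 1.0], 4)
print(empirical, limit)
```

For large N the two lists of moments should be close, in contrast with the unitary case of Theorem 1, where the limit is the free convolution instead.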

Coming back to free convolution, note that we have not said how to compute
explicitly the measure $\nu_1 \boxplus \nu_2$ in terms of $\nu_1$ and $\nu_2$. This is where free probability
comes in. As we shall see in the next section, Theorem 1 is the consequence of a
more general result concerning large hermitian matrices, namely the fact that they
give rise, asymptotically, to free random variables.

3. Free random variables 

Before we describe the connection with large matrices, we introduce the necessary 
notions from free probability. We start with some purely algebraic definitions. 

Let $A$ be a complex algebra with a unit, and $\varphi$ a $\mathbb{C}$-valued linear form on $A$,
such that $\varphi(1) = 1$. Although, in the examples that we shall consider, the algebra
$A$ will always be non-commutative, it will be convenient to think of the elements
of this algebra as random variables, while the map $\varphi$ should be considered as the
expectation map of classical probability theory (here and in the sequel, we use
the word classical for usual probability theory, as opposed to the non-commutative
theory we develop).

We now introduce the basic notion of free probability theory. 

Definition 1. Let $I$ be a set of indices, and $B_i$, for $i \in I$, be subalgebras of $A$,
containing the unit. The algebras $B_i$, $i \in I$, are called free if one has $\varphi(a_1 \cdots a_n) = 0$
each time $\varphi(a_j) = 0$ for all $j = 1, \ldots, n$ and $a_j \in B_{i_j}$ for some indices
$i_1 \neq i_2 \neq \cdots \neq i_n$.

Pursuing our analogy with classical probability, if we think of the algebras Bi as 
algebras of random variables, measurable with respect to some sub-sigma field, then 
the above definition is a non-commutative analogue of the definition of independent 
subalgebras. In fact, although it might not be obvious at first sight, this definition 
captures the essence of both the notion of algebraic independence, and that of 
independence of sigma-algebras in classical probability. Let us stress however, that 
this definition is not a non-commutative extension of the notion of independence, 
indeed algebras generated by independent random variables, in the sense of classical 
probability theory, are not free in the sense of the above definition. 

Before making some elementary comments, we shall give a convenient definition, 
whose analogy with classical probability should be obvious. 

Definition 2. Let $H_i$, $i \in I$, be subsets of $A$; they are called free if the subalgebras
$B_i$ (= unital algebra generated by $H_i$), for $i \in I$, are free.

In order to get acquainted with freeness we shall make a few computations. So
let $B, C \subset A$ be two free subalgebras, and $b \in B$, $c \in C$; then we can write $b = b' + \bar{b}$
where $\bar{b} = \varphi(b)1$, so that $\varphi(b') = 0$. Similarly we have $c = c' + \bar{c}$. Then using the
definition of freeness, we see that $\varphi(b'c') = 0$, thus

$$\varphi(bc) = \varphi((b' + \bar{b})(c' + \bar{c}))$$
$$= \varphi(b'c' + b'\bar{c} + \bar{b}c' + \bar{b}\bar{c})$$
$$= \varphi(b)\varphi(c).$$

Here we have used $\varphi(\bar{b}c') = \varphi(\varphi(b)1 \cdot c') = \varphi(b)\varphi(c') = 0$, and similarly $\varphi(b'\bar{c}) = 0$.

This means that for two free elements $b$ and $c$, their expectations factorize,
exactly as in the case of independent variables. Let us now look at a more
subtle example. Take $b_1, b_2 \in B$, and $c_1, c_2 \in C$; then we can expand

$$\varphi(b_1c_1b_2c_2) = \varphi((b_1' + \bar{b}_1)(c_1' + \bar{c}_1)(b_2' + \bar{b}_2)(c_2' + \bar{c}_2))$$

$$= \varphi(b_1'c_1'b_2'c_2' + b_1'c_1'b_2'\bar{c}_2 + b_1'c_1'\bar{b}_2c_2' + b_1'c_1'\bar{b}_2\bar{c}_2$$
$$+ b_1'\bar{c}_1b_2'c_2' + b_1'\bar{c}_1b_2'\bar{c}_2 + b_1'\bar{c}_1\bar{b}_2c_2' + b_1'\bar{c}_1\bar{b}_2\bar{c}_2$$
$$+ \bar{b}_1c_1'b_2'c_2' + \bar{b}_1c_1'b_2'\bar{c}_2 + \bar{b}_1c_1'\bar{b}_2c_2' + \bar{b}_1c_1'\bar{b}_2\bar{c}_2$$
$$+ \bar{b}_1\bar{c}_1b_2'c_2' + \bar{b}_1\bar{c}_1b_2'\bar{c}_2 + \bar{b}_1\bar{c}_1\bar{b}_2c_2' + \bar{b}_1\bar{c}_1\bar{b}_2\bar{c}_2)$$

Using the fact that $\varphi(b_1'c_1'b_2'c_2') = 0$ by the freeness property, we are left with
terms in which some expectation factorizes. For example we shall treat the term
$\varphi(b_1'\bar{c}_1b_2'c_2') = \varphi(c_1)\varphi(b_1'b_2'c_2')$. By the factorization property already obtained one
has $\varphi(b_1'b_2'c_2') = \varphi(b_1'b_2')\varphi(c_2') = 0$. The other terms can be treated by similar
considerations, and after some straightforward manipulations we arrive at the formula

$$\varphi(b_1c_1b_2c_2) = \varphi(b_1b_2)\varphi(c_1)\varphi(c_2) + \varphi(b_1)\varphi(b_2)\varphi(c_1c_2) - \varphi(b_1)\varphi(b_2)\varphi(c_1)\varphi(c_2)$$

We observe that the computation of the expectation of the product $b_1c_1b_2c_2$ can
be reduced to the computation of expectations in the subalgebras $B$ and $C$. This
also shows that the result is different from the one we would have obtained with
independent (commuting) random variables.
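
To make the difference concrete, here is a small sketch, not taken from the original text, that evaluates the formula for $\varphi(b_1c_1b_2c_2)$ and compares it with the value $\varphi(b_1b_2)\varphi(c_1c_2)$ that commuting independent variables would give. The function names and the choice of example (two projections of expectation 1/2, with $b_1 = b_2$ and $c_1 = c_2$) are assumptions made for illustration.

```python
from fractions import Fraction

def free_mixed_moment(phi_b1, phi_b2, phi_b1b2, phi_c1, phi_c2, phi_c1c2):
    """phi(b1 c1 b2 c2) for b's and c's in free subalgebras, using the formula in the text."""
    return (phi_b1b2 * phi_c1 * phi_c2
            + phi_b1 * phi_b2 * phi_c1c2
            - phi_b1 * phi_b2 * phi_c1 * phi_c2)

def commuting_independent_moment(phi_b1b2, phi_c1c2):
    """phi(b1 c1 b2 c2) = phi(b1 b2) phi(c1 c2) for commuting independent variables."""
    return phi_b1b2 * phi_c1c2

half = Fraction(1, 2)
# b1 = b2 = b and c1 = c2 = c are projections with phi(b) = phi(b^2) = 1/2, same for c.
free_value = free_mixed_moment(half, half, half, half, half, half)
classical_value = commuting_independent_moment(half, half)
print(free_value, classical_value)
```

In this example the free value is 3/16 while the commuting independent value is 1/4.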

It is not difficult to see that the above computation can be generalized. More 
precisely, taking care of how the terms are successively reduced, the reader should 
check, and it is a good exercise do so, that the following is true 

Proposition 1. Let $B_i$, $i \in I$, be free subalgebras in $A$, and $a_1, \ldots, a_n \in A$ such
that for all $j = 1, \ldots, n$, one has $a_j \in B_{i_j}$ for some $i_j \in I$. Let $\Pi$ be the partition
of $\{1, \ldots, n\}$ determined by $j \sim k$ if $i_j = i_k$. For each partition $\pi$ of $\{1, \ldots, n\}$,
let $\varphi_\pi = \prod_{\{j_1, \ldots, j_r\} \in \pi} \varphi(a_{j_1} \cdots a_{j_r})$; then there exist universal coefficients $c(\pi, \Pi)$,
such that

$$\varphi(a_1 \cdots a_n) = \sum_{\pi \le \Pi} c(\pi, \Pi)\, \varphi_\pi$$

where the sum is over partitions $\pi$ which are finer than $\Pi$.

In particular, this shows that $\varphi(a_1 \cdots a_n)$ can be computed explicitly in terms
of the restriction of $\varphi$ to the algebras $B_i$. This is reminiscent of the fact that
the joint distribution of a family of independent random variables is completely
determined if we know the distribution of each of the random variables.

As the example that we treated above suggests, the algorithm we have described
for computing the coefficients $c(\pi, \Pi)$ leads quickly to intractable calculations. Finding
an explicit formula for these coefficients is a non-trivial combinatorial problem,
which fortunately has been solved by R. Speicher. We shall come back to describe
his solution later. Let us just mention for the moment that it involves a certain
class of partitions, called "non-crossing".

It is time now to state a result showing that the notion we have introduced is
meaningful, i.e. that there exist non-trivial examples. In this respect the situation
is optimal, and we have

Theorem 3. Let $B_i$, $i \in I$, be complex unital algebras, equipped with normalized
(i.e. $\varphi_i(1) = 1$) linear forms $\varphi_i$. Then there exists an algebra $A$, with a normalized
linear form $\varphi$, and unital injective morphisms $\iota_i : B_i \to A$, satisfying $\varphi_i = \varphi \circ \iota_i$,
and the algebras $\iota_i(B_i)$, $i \in I$, are free in $A$.

The construction of the algebra $A$ consists in enforcing the freeness condition in
a rather straightforward way. One defines it as the "free amalgamated product" of
the $B_i$; more precisely, denote by $B'_i$, for all $i \in I$, the subspace of elements in $B_i$
with zero expectation, and then take for $A$ the direct sum of $\mathbb{C}1$ (1 is the unit in
$A$), and all spaces $B'_{i_1} \otimes \cdots \otimes B'_{i_n}$, where $i_1, \ldots, i_n$ runs over all finite sequences in $I$
satisfying $i_1 \neq i_2 \neq \cdots \neq i_n$. In order to turn $A$ into an algebra, we need to specify
the product of two elements $(a_1 \otimes \cdots \otimes a_n)(b_1 \otimes \cdots \otimes b_m)$. If $a_n$ and $b_1$ belong to
distinct algebras, then the product is simply $a_1 \otimes \cdots \otimes a_n \otimes b_1 \otimes \cdots \otimes b_m$, which
belongs to $A$. If $a_n$ and $b_1$ belong to the same algebra $B_i$, then $\varphi_i(a_nb_1)$ is not zero
in general, so we write $a_nb_1 = (a_nb_1)' + \overline{a_nb_1}$, and put

$$(a_1 \otimes \cdots \otimes a_n)(b_1 \otimes \cdots \otimes b_m) = a_1 \otimes \cdots \otimes a_{n-1} \otimes (a_nb_1)' \otimes b_2 \otimes \cdots \otimes b_m$$
$$+ \varphi_i(a_nb_1)(a_1 \otimes \cdots \otimes a_{n-1})(b_2 \otimes \cdots \otimes b_m)$$

Since the "length" of $(a_1 \otimes \cdots \otimes a_{n-1})(b_2 \otimes \cdots \otimes b_m)$ is $m + n - 2$, the computation
can be repeated, and we are done after a finite number of steps. The injection $\iota_i$ is
defined by $\iota_i(b) = \varphi_i(b) \cdot 1 + b'$, and the linear form $\varphi$ by $\varphi(\alpha \cdot 1) = \alpha$ for $\alpha \in \mathbb{C}$, and
$\varphi = 0$ on each space $B'_{i_1} \otimes \cdots \otimes B'_{i_n}$. It is not difficult to check that the pair $(A, \varphi)$
fulfills the required conditions.

4. Free convolution 

We are now going to be more specific about the kind of algebras that we are
considering. Since we want to do some probability theory, it will be convenient to
be able to say that a function of a random variable is again a random variable,
and that a random variable $X$ has a distribution, namely a probability measure
$\mu_X$, such that $\varphi(f(X)) = \int f(x)\,\mu_X(dx)$ for any bounded function $f$. The cleanest
way to do that is to assume that our algebra $A$ is a von Neumann algebra, which
means that it is an algebra of bounded operators on some complex Hilbert space
$H$, closed under taking adjoints of operators, and under taking limits for the weak
operator topology (i.e. simple weak convergence on $H$). Since we want $\varphi$ to be an
analogue of the expectation map with respect to a probability measure, we need
some positivity assumption. Positivity here will be taken in the sense of operator
theory, namely an element $X$ of $A$ will be positive if it is a self-adjoint positive
operator on $H$. The positivity requirement on $\varphi$ is that it takes nonnegative values
on positive operators in $A$. We shall also assume that $\varphi$ is continuous for the weak
operator topology. You do not need to be familiar with von Neumann algebras in
order to read what follows; in fact the only property of a von Neumann
algebra that we will use is the stability with respect to functional calculus, namely
if $X \in A$ is a self-adjoint operator on $H$, then the operator $f(X)$, which can be
defined by functional calculus for any bounded Borel function $f$ on $\mathbb{R}$, still belongs
to the algebra $A$. Also, if $X$ is self-adjoint then the map $f \mapsto \varphi(f(X))$, defined for
bounded Borel functions, is given by $f \mapsto \int_{\mathbb{R}} f(x)\,\mu_X(dx)$, for a unique probability
measure $\mu_X$, with compact support on $\mathbb{R}$. The probability measure $\mu_X$ is called
the distribution of $X$. Thus, self-adjoint operators in $A$ behave somewhat like (real
valued) random variables.

If $A$ is a von Neumann algebra, and $\varphi$ is a normalized, weakly continuous, positive
linear form on $A$, we shall call the couple $(A, \varphi)$ a non-commutative probability
space.

It turns out that the amalgamated free product of algebras exists in the category
of non-commutative probability spaces, namely if the $(B_i, \varphi_i)$ in Theorem 3 are
non-commutative probability spaces, then we can choose for $(A, \varphi)$ a non-commutative
probability space, and the $\iota_i$ are weakly continuous. This can be seen by applying
the GNS construction to the amalgamated free product constructed after Theorem
3. We refer to [VDN] for a detailed proof.

We are now in a position to give the free-probabilistic definition of the free
convolution of measures, which was introduced in Theorem 1. Let $\mu$ and $\nu$ be probability
measures with compact support on $\mathbb{R}$; then there exists a non-commutative
probability space $(A, \varphi)$ and self-adjoint elements $X, Y$ in $A$ with respective distributions
$\mu$ and $\nu$, such that $X$ and $Y$ are free. In order to construct $(A, \varphi)$ and $X, Y$, we
can do the following: take the algebra $L^\infty(\mathbb{R}, \mu)$, which can be considered as a von
Neumann algebra of operators on $L^2(\mu)$, elements of $L^\infty(\mathbb{R}, \mu)$ acting by
multiplication on $L^2(\mu)$. The expectation map $\varphi_\mu$ (integration with respect to $\mu$) turns the
pair $(L^\infty(\mathbb{R}, \mu), \varphi_\mu)$ into a non-commutative probability space. The map $x \mapsto x$
on $\mathbb{R}$ defines a self-adjoint element of $L^\infty(\mathbb{R}, \mu)$ (since $\mu$ has compact support),
which we call $X$. By the general construction described above, there exists a
non-commutative probability space, containing $(L^\infty(\mathbb{R}, \mu), \varphi_\mu)$ and $(L^\infty(\mathbb{R}, \nu), \varphi_\nu)$ as
free subalgebras, and thus we get a non-commutative probability space containing
two free elements $X$ and $Y$ with respective distributions $\mu$ and $\nu$. Since $X$ and $Y$
are bounded self-adjoint operators, the distribution of $X + Y$ is a probability measure
with compact support on $\mathbb{R}$, and it is characterized by its moments. Expanding
the expression $\varphi((X + Y)^n)$ and using Proposition 1, we see that the moments
$\varphi((X + Y)^n)$ can be computed as polynomial functions of the moments of $\mu$ and $\nu$,
so that the distribution of $X + Y$ really depends only on the measures $\mu$ and $\nu$ and
not on the particular realization of the operators $X$ and $Y$. Thus we can define $\mu \boxplus \nu$
as the distribution of $X + Y$, where $X$ and $Y$ is any pair of free random variables,
with respective distributions $\mu$ and $\nu$. Clearly this definition makes the operation
$\boxplus$ the free analogue of the convolution of measures in classical probability.

We shall now recover Theorem 1 from the following result, which shows that 
large hermitian matrices give an asymptotic model for free random variables. 

Theorem 4. Let $A_N$, $B_N$, $\nu_1$ and $\nu_2$ be as in the statement of Theorem 1, and
let $X, Y$ be two free self-adjoint elements of some non-commutative probability space
$(A, \varphi)$, with respective distributions $\nu_1$ and $\nu_2$. Then for every non-commutative
polynomial $P$ in two indeterminates, one has $\frac{1}{N}\mathrm{tr}(P(A'_N, B'_N)) \to \varphi(P(X, Y))$ in
probability, as $N \to \infty$, where $A'_N$ and $B'_N$ are random matrices chosen independently,
with respective distributions $\mu_{A_N}$ and $\mu_{B_N}$.

Applying Theorem 4 to the non-commutative polynomials $P_n(X, Y) = (X + Y)^n$,
we see that the moments of $A'_N + B'_N$ converge in probability towards those of $X + Y$.
Since we are dealing with compactly supported measures, we conclude that the
associated distributions converge weakly, and we get Theorem 1.

We shall not give a proof of Theorem 4 here, since this is a far from easy result.
It is a consequence of a more general result from the paper [Vo2] (see also [VDN]),
and another, more direct proof can also be found in [X].

It is now time to give a formula for computing the free convolution of two
measures. For the classical convolution of probability measures, one way to perform this
computation is to use the Fourier transform, which converts convolution into multiplication.
Observe that for probability measures with compact support, the logarithm of the
Fourier transform can be expanded into a power series

$$\log\left(\int_{\mathbb{R}} e^{itx}\,\mu(dx)\right) = \sum_{n=1}^{\infty} \frac{a_n(\mu)}{n!}(it)^n$$

This follows from a substitution of the power series

$$\int_{\mathbb{R}} e^{itx}\,\mu(dx) = 1 + \sum_{n=1}^{\infty} \frac{(it)^n}{n!} \int_{\mathbb{R}} x^n\,\mu(dx)$$

into the expansion of $\log(1 + z)$. The coefficients $a_n(\mu)$ are polynomial functions
in the moments of $\mu$, called the cumulants of the measure $\mu$, and the
multiplicativity property of the Fourier transform of convolution shows that these cumulants
linearize the convolution, namely $a_n(\mu * \nu) = a_n(\mu) + a_n(\nu)$. It turns out that
the free convolution admits a similar description in terms of so-called "non-crossing
cumulants", which are defined as follows. Let

$$G_\mu(\zeta) = \int_{\mathbb{R}} \frac{1}{\zeta - t}\,\mu(dt)$$

be the Cauchy transform of $\mu$. This defines an analytic function on $\mathbb{C} \setminus \mathbb{R}$, such that
$G_\mu(\bar{\zeta}) = \overline{G_\mu(\zeta)}$, and $G_\mu(\mathbb{C}^+) \subset \mathbb{C}^-$ where $\mathbb{C}^+$ and $\mathbb{C}^-$ denote respectively the sets
of complex numbers with positive and negative imaginary part. We can expand
this function in a Laurent series involving the moments of $\mu$

$$G_\mu(\zeta) = \sum_{n=0}^{\infty} \zeta^{-n-1} \int_{\mathbb{R}} x^n\,\mu(dx)$$

This series can be inverted formally with respect to $\zeta$, the formal inverse having
the form

$$K_\mu(z) = \frac{1}{z} + \sum_{n=1}^{\infty} C_n z^{n-1}$$

where the coefficients $C_n$ are polynomial functions of the moments $m_k = \int_{\mathbb{R}} x^k\,\mu(dx)$
of $\mu$. The most convenient way to proceed in order to compute the $C_n$ is to write
the equation defining $K_\mu$ as

$$G_\mu(\zeta)K_\mu(G_\mu(\zeta)) = 1 + \sum_{n=1}^{\infty} C_n G_\mu(\zeta)^n = \zeta G_\mu(\zeta)$$

Equating the coefficients of $\zeta^{-n}$ in the second and third members of the above
equality, we can evaluate the $C_n$ in terms of the moments recursively. The first few
values are

$$C_1 = m_1$$
$$C_2 = m_2 - m_1^2$$
$$C_3 = m_3 - 3m_1m_2 + 2m_1^3$$
These formulas can be inverted, and we get the moments in terms of the $C_n$ as

$$m_1 = C_1$$
$$m_2 = C_2 + C_1^2$$
$$m_3 = C_3 + 3C_1C_2 + C_1^3$$
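
The recursive evaluation can be sketched in a few lines. The code below is an illustration, not part of the original text: it uses the identity $1 + \sum_n C_n G_\mu(\zeta)^n = \zeta G_\mu(\zeta)$ in the equivalent form $m_n = \sum_{s=1}^{n} C_s\,[z^{n-s}]M(z)^s$, where $M(z) = 1 + \sum_k m_k z^k$ and $[z^d]$ denotes the coefficient of $z^d$. The function names are hypothetical.

```python
from fractions import Fraction

def power_coeff(m, s, deg):
    """Coefficient of z**deg in M(z)**s, where M(z) = sum_k m[k] z**k and m[0] = 1."""
    coeffs = [Fraction(1)] + [Fraction(0)] * deg
    for _ in range(s):
        new = [Fraction(0)] * (deg + 1)
        for i, c in enumerate(coeffs):
            for j in range(deg + 1 - i):
                new[i + j] += c * m[j]
        coeffs = new
    return coeffs[deg]

def free_cumulants_from_moments(moments):
    """Given [m_1, ..., m_N], return the coefficients [C_1, ..., C_N]."""
    m = [Fraction(1)] + [Fraction(x) for x in moments]
    cumulants = []
    for n in range(1, len(moments) + 1):
        # The term s = n of the recursion is C_n itself; lower terms are already known.
        lower = sum(cumulants[s - 1] * power_coeff(m, s, n - s) for s in range(1, n))
        cumulants.append(m[n] - lower)
    return cumulants

def moments_from_free_cumulants(cumulants):
    """Given [C_1, ..., C_N], return the moments [m_1, ..., m_N]."""
    m = [Fraction(1)]
    for n in range(1, len(cumulants) + 1):
        m.append(sum(Fraction(cumulants[s - 1]) * power_coeff(m, s, n - s)
                     for s in range(1, n + 1)))
    return m[1:]

example_cumulants = free_cumulants_from_moments([2, 7, 30])
round_trip = moments_from_free_cumulants(example_cumulants)
print(example_cumulants, round_trip)
```

With the arbitrary test moments $(2, 7, 30)$ this reproduces the three formulas above: $C_1 = 2$, $C_2 = 7 - 4 = 3$, $C_3 = 30 - 42 + 16 = 4$.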

The coefficients $C_n$, which we shall call the free cumulants of the measure $\mu$, play
the role of cumulants for the free convolution, namely one has

Theorem 5. For all compactly supported measures $\mu$ and $\nu$ on $\mathbb{R}$, and all $n \ge 1$,
one has

$$C_n(\mu \boxplus \nu) = C_n(\mu) + C_n(\nu).$$

Since one can recover the moments of a measure from its free cumulants, this
determines completely the measure $\mu \boxplus \nu$.

This result was first proved by Voiculescu in [Vo1], using free creation and
annihilation operators (the formula has also been discovered independently around
the same time, in the more restrictive context of random walks on free products
of groups, see [W]). Later, Speicher used his combinatorial approach to freeness to
give another proof of Theorem 5. We shall describe Speicher's proof in section 5.

Let us give an example of the computation of $\boxplus$. Let

$$\mu = \nu = \frac{1}{2}(\delta_0 + \delta_1);$$

one has

$$G_\mu(\zeta) = \frac{\zeta - 1/2}{\zeta(\zeta - 1)} \quad \text{and} \quad K_\mu(z) = \frac{z + 1 + \sqrt{1 + z^2}}{2z}.$$

By Theorem 5 one has $K_{\mu \boxplus \mu}(z) = 2K_\mu(z) - \frac{1}{z}$. This gives

$$K_{\mu \boxplus \mu}(z) = 1 + \frac{1}{z}\sqrt{1 + z^2}$$

and

$$G_{\mu \boxplus \mu}(\zeta) = \frac{1}{\sqrt{\zeta(\zeta - 2)}}.$$

The Cauchy transform can be inverted to give

$$\mu \boxplus \mu(dx) = \frac{1}{\pi\sqrt{x(2 - x)}}\,dx \quad \text{on } [0, 2].$$

This is the famous arcsine distribution (here on the interval $[0, 2]$). If we recall
Theorem 1, we have the following striking interpretation of this computation: take
a large integer $N$, and two subspaces of $\mathbb{C}^N$, of dimension $[N/2]$; then for most
choices of these subspaces, the sum of the corresponding orthogonal projections has
an eigenvalue distribution which is well approximated by the arcsine distribution.
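
This example can be cross-checked through moments. The sketch below is not from the original text. It computes the free cumulants of $\frac{1}{2}(\delta_0 + \delta_1)$, doubles them as Theorem 5 prescribes, and converts back to moments. It relies on one fact not stated in the text: the $n$-th moment of the arcsine distribution on $[0, 2]$ is $\binom{2n}{n}/2^n$, since this is the law of $1 + \cos\theta$ with $\theta$ uniform. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def power_coeff(m, s, deg):
    """Coefficient of z**deg in M(z)**s, where M(z) = sum_k m[k] z**k and m[0] = 1."""
    coeffs = [Fraction(1)] + [Fraction(0)] * deg
    for _ in range(s):
        new = [Fraction(0)] * (deg + 1)
        for i, c in enumerate(coeffs):
            for j in range(deg + 1 - i):
                new[i + j] += c * m[j]
        coeffs = new
    return coeffs[deg]

def free_cumulants_from_moments(moments):
    """Moments [m_1..m_N] -> free cumulants [C_1..C_N]."""
    m = [Fraction(1)] + [Fraction(x) for x in moments]
    cumulants = []
    for n in range(1, len(moments) + 1):
        lower = sum(cumulants[s - 1] * power_coeff(m, s, n - s) for s in range(1, n))
        cumulants.append(m[n] - lower)
    return cumulants

def moments_from_free_cumulants(cumulants):
    """Free cumulants [C_1..C_N] -> moments [m_1..m_N]."""
    m = [Fraction(1)]
    for n in range(1, len(cumulants) + 1):
        m.append(sum(Fraction(cumulants[s - 1]) * power_coeff(m, s, n - s)
                     for s in range(1, n + 1)))
    return m[1:]

order = 6
bernoulli_moments = [Fraction(1, 2)] * order       # every moment of (delta_0 + delta_1)/2 is 1/2
c_mu = free_cumulants_from_moments(bernoulli_moments)
c_sum = [2 * c for c in c_mu]                      # Theorem 5 with mu = nu
moments_sum = moments_from_free_cumulants(c_sum)
arcsine_moments = [Fraction(comb(2 * n, n), 2 ** n) for n in range(1, order + 1)]
print(moments_sum)
print(arcsine_moments)
```

The two printed lists agree, which is consistent with the closed form of the density obtained above.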
4. Theory of addition of free random variables

Once one has defined the free convolution of measures, one may try to develop the
theory of addition of free random variables in parallel with the theory of addition
of independent random variables. It turns out that most of the well-known classical
results, such as the law of large numbers, the central limit theorem, or the Levy-Khintchine
formula, have free analogues, and this develops into a beautiful new
branch of mathematics, which sheds new light on some well-known results like
the Wigner theorem on spectra of random gaussian matrices. We shall give a brief
survey of these results below, but before that we need to extend the free convolution
to probability measures with unbounded support.

4.1 Free convolution of measures with unbounded support. Let $\mu$ be an
arbitrary probability measure on $\mathbb{R}$ and let

$$G_\mu(\zeta) = \int_{\mathbb{R}} \frac{1}{\zeta - t}\,\mu(dt)$$

be its Cauchy transform. Since $\mu$ may very well have no moment at all, we no
longer have the expansion of $G_\mu$ into a power series in $\zeta^{-1}$; however the following
is nevertheless true. Let

$$\Delta_{\alpha,\beta} = \{z = x + iy \mid y < 0;\ \alpha y < x < -\alpha y;\ |z| < \beta\}$$

For every $\alpha > 0$, there exists a real number $\beta > 0$ such that the function $G_\mu$ has
a right inverse defined on the domain $\Delta_{\alpha,\beta}$, taking values in some domain of the
form

$$\Gamma_{\gamma,\lambda} = \{z = x + iy \mid y > 0;\ -\gamma y < x < \gamma y;\ |z| > \lambda\}$$

with $\gamma, \lambda > 0$. Call $K_\mu$ this right inverse, and let $R_\mu(z) = K_\mu(z) - \frac{1}{z}$. The function
$R_\mu$ is called the R-transform of the measure $\mu$. Given another probability measure
$\nu$ on $\mathbb{R}$, we shall define a new probability measure $\mu \boxplus \nu$ by the requirement that

$$R_{\mu \boxplus \nu} = R_\mu + R_\nu$$

on some domain of the form $\Delta_{\alpha,\beta}$, where these three functions are defined. It
turns out that this definition is meaningful, and it is clear that it coincides with
the previous definition in the case where $\mu$ and $\nu$ have compact support. In fact,
there is also an interpretation of $\mu \boxplus \nu$ as the distribution of the sum of two free
(unbounded in general) self-adjoint elements affiliated to some non-commutative
probability space, see [BV] for details.

4.2 Law of large numbers. The most general formulation of the law of large
numbers for sums of independent equidistributed random variables asserts that the
distribution of $\frac{1}{n}(X_1 + \cdots + X_n - M_n)$ converges weakly towards the Dirac measure
at zero, for some constants $M_n$, if and only if the common distribution $\mu$ of the $X_j$
satisfies $t\mu(\mathbb{R} \setminus [-t, t]) \to 0$ as $t \to \infty$. It turns out that exactly the same result
holds if one replaces independent random variables by free ones. This is proved in
[BP].
4.3 The Central Limit Theorem. Let $\mu$ be a probability measure having zero
mean and finite variance $\sigma$; then one has $R_\mu(z) \sim \sigma z$ as $z \to 0$. Let now
$X_1, \ldots, X_n, \ldots$ be a sequence of identically distributed free random variables, with
common distribution $\mu$; then it is easy to see that the distribution of $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ has
R-transform $\sqrt{n}\,R_\mu(\frac{z}{\sqrt{n}})$, which converges, as $n \to \infty$, towards $\sigma z$. Using continuity
properties of the R-transform, it is not difficult to see that this implies the following
free central limit theorem.

Theorem 6. The distribution of $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ converges weakly, as $n \to \infty$, towards
the distribution with R-transform $\sigma z$.

The distribution appearing in the free central limit theorem can be computed by
inverting the R-transform: it is the famous Wigner semi-circular distribution, given
by the density $\frac{1}{2\pi\sigma}\sqrt{4\sigma - x^2}$ on the interval $[-2\sqrt{\sigma}, 2\sqrt{\sigma}]$. This central limit
theorem, together with Theorem 4 on asymptotic freeness of random matrices, provides
a conceptual framework for the well-known result of Wigner on the asymptotic
behaviour of the spectra of large random matrices with gaussian entries.
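
In the compactly supported case the free central limit theorem can be followed on free cumulants: adding $n$ free copies multiplies each $C_k$ by $n$, and dividing by $\sqrt{n}$ multiplies $C_k$ by $n^{-k/2}$. The sketch below is an illustration under these assumptions, not part of the original text. It starts from the symmetric Bernoulli measure $\frac{1}{2}(\delta_{-1} + \delta_1)$, which has variance 1, and checks that the even moments approach the Catalan numbers $1, 2, 5$. These are the even moments of the semi-circular distribution of variance 1, a standard fact not stated above. The helper names are hypothetical.

```python
from fractions import Fraction

def power_coeff(m, s, deg):
    """Coefficient of z**deg in M(z)**s, where M(z) = sum_k m[k] z**k and m[0] = 1."""
    coeffs = [Fraction(1)] + [Fraction(0)] * deg
    for _ in range(s):
        new = [Fraction(0)] * (deg + 1)
        for i, c in enumerate(coeffs):
            for j in range(deg + 1 - i):
                new[i + j] += c * m[j]
        coeffs = new
    return coeffs[deg]

def free_cumulants_from_moments(moments):
    """Moments [m_1..m_N] -> free cumulants [C_1..C_N]."""
    m = [Fraction(1)] + [Fraction(x) for x in moments]
    cumulants = []
    for n in range(1, len(moments) + 1):
        lower = sum(cumulants[s - 1] * power_coeff(m, s, n - s) for s in range(1, n))
        cumulants.append(m[n] - lower)
    return cumulants

def moments_from_free_cumulants(cumulants):
    """Free cumulants [C_1..C_N] -> moments [m_1..m_N]."""
    m = [Fraction(1)]
    for n in range(1, len(cumulants) + 1):
        m.append(sum(Fraction(cumulants[s - 1]) * power_coeff(m, s, n - s)
                     for s in range(1, n + 1)))
    return m[1:]

r = 1000
n = r * r                                   # number of free copies, a perfect square
bernoulli_moments = [0, 1, 0, 1, 0, 1]      # moments of (delta_{-1} + delta_1)/2
c_mu = free_cumulants_from_moments(bernoulli_moments)
# k-th cumulant of (X_1 + ... + X_n)/sqrt(n) is n * C_k / n**(k/2) = C_k * r**(2 - k)
c_scaled = [c * Fraction(r) ** (2 - k) for k, c in enumerate(c_mu, start=1)]
scaled_moments = moments_from_free_cumulants(c_scaled)
print([float(x) for x in scaled_moments])
```

Only the second cumulant survives in the limit, which corresponds to the R-transform $\sigma z$ with $\sigma = 1$.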

4.4 Infinitely divisible distributions. There is a notion of infinitely divisible
measures for the free additive convolution, which is the obvious one, namely a
probability measure $\mu$ on the real line is said to be freely infinitely divisible if for
every positive integer $n$, there exists a measure $\mu_n$ such that $\mu_n^{\boxplus n} = \mu$. The following
characterization of freely infinitely divisible measures on $\mathbb{R}$ has been obtained by
Bercovici and Voiculescu [BV].

Theorem 7. A probability measure $\mu$ on $\mathbb{R}$ is infinitely divisible if and only if the
function $R_\mu$ has an analytic continuation to the whole of $\mathbb{C}^+$, with values in $\mathbb{C}^- \cup \mathbb{R}$,
and one has

$$\lim_{y \to 0,\ y \in \mathbb{R}} yR_\mu(iy) = 0.$$

Furthermore any analytic function $R$ having the above properties is the R-transform
of some probability measure.

Functions such as those appearing in Theorem 7 have a Nevanlinna representation,
which can be seen here as the free analogue of the Levy-Khintchine formula.
More precisely, let $R$ be the R-transform of some infinitely divisible probability
measure; then there exists a real number $\alpha$, and a finite positive measure $\nu$ on $\mathbb{R}$,
such that

$$R(z) = \alpha + \int_{-\infty}^{+\infty} \frac{z + t}{1 - tz}\,d\nu(t)$$

The extreme points in this integral representation have an interpretation similar
to the one of the classical Levy-Khintchine formula, namely the function $R(z) = \alpha$
is the R-transform of a Dirac mass at the point $\alpha$; the function $R(z) = \sigma z$,
corresponding to a point mass at zero for $\nu$, is the R-transform of the semi-circular
distribution with variance $\sigma$. Finally the probability measure with R-transform
$R(z) = \lambda\frac{t}{1 - tz}$ is the free analogue of the Poisson distribution, namely, it is the
weak limit, as $n \to \infty$, of the measures $((1 - \frac{\lambda}{n})\delta_0 + \frac{\lambda}{n}\delta_t)^{\boxplus n}$ (the "free binomial
distributions").

4.5 Stable distributions. One can define stable distributions exactly as for
classical convolution, namely, a probability measure $\mu$ on $\mathbb{R}$ is called stable if and only if
the set of probability measures obtained from $\mu$ by applying affine transformations
of $\mathbb{R}$ is stable under free convolution. The set of all free stable probability measures
on the real line has been determined by Bercovici and Voiculescu [BV]. It turns out
that there is a natural one-to-one correspondence between stable distributions and
free stable distributions, and the domains of attraction are the same in the classical
and the free cases. In fact, any free stable distribution is the image by an affine
transformation of a distribution whose R-transform belongs to the following list

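(1) $R(z) = e^{i\pi\theta}z^{\alpha - 1}$ where $1 < \alpha < 2$, and $(\alpha - 2) < \theta < 0$.

(2) $R(z) = a + b\log z$ where $a \in \mathbb{C}^+ \cup \mathbb{R}$ and $b > -\Im a/\pi$.

(3) $R(z) = e^{i\pi\theta}z^{\alpha - 1}$ where $0 < \alpha < 1$, and $1 < \theta < 1 + \alpha$.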

In particular, the Cauchy distribution is a free stable distribution of stability index 
one. 

For these results, see [BVB] 

5. Speicher's combinatorial approach to freeness

We shall now describe Speicher's combinatorial approach to the computation
of the coefficients $c(\pi, \Pi)$ of Proposition 1, and some applications. A thorough
discussion is given in [Sp2].

Let $S$ be a totally ordered set. A partition of the set $S$ is said to have a crossing if
there exists a quadruple $(i, j, k, l) \in S^4$, with $i < j < k < l$, such that $i$ and $k$ belong
to some class of the partition and $j$ and $l$ belong to another class. If a partition has
no crossing, it is called non-crossing. The set of all non-crossing partitions of $S$ is
denoted by $NC(S)$; it is a lattice with respect to the dual refinement order (such
that $\pi \le \sigma$ if $\pi$ is a finer partition than $\sigma$).

When $S = \{1, \ldots, n\}$, with its natural order, we will use the notation $NC(n)$.
Here is an example with $n = 8$, $\pi = \{\{1, 4, 5\}, \{2\}, \{3\}, \{6, 8\}, \{7\}\}$.

[Figure: the points 1, ..., 8 placed on a circle, with the classes of $\pi$ joined by segments.]

We draw a segment joining each point with the next point in the same class of
the partition. The non-crossing condition means that segments should not intersect
inside the circle.
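
The definition of a crossing translates directly into a small enumeration. The following is a minimal sketch, not part of the original text, with hypothetical function names. It lists all partitions of $\{1, \ldots, n\}$, discards those having a crossing, and counts the remaining ones. The counts it produces are the Catalan numbers, a classical fact about $NC(n)$.

```python
from itertools import combinations

def set_partitions(elements):
    """Yield all partitions of a sorted list, each as a list of sorted blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Insert `first` into each existing block in turn, or give it its own block.
        for idx in range(len(partition)):
            yield partition[:idx] + [[first] + partition[idx]] + partition[idx + 1:]
        yield [[first]] + partition

def has_crossing(partition):
    """True if some i < j < k < l has i, k in one class and j, l in another class."""
    for block_a, block_b in combinations(partition, 2):
        for x, y in ((block_a, block_b), (block_b, block_a)):
            for i, k in combinations(x, 2):
                for j, l in combinations(y, 2):
                    if i < j < k < l:
                        return True
    return False

def non_crossing_partitions(n):
    """All non-crossing partitions of {1, ..., n}."""
    return [p for p in set_partitions(list(range(1, n + 1))) if not has_crossing(p)]

counts = [len(non_crossing_partitions(n)) for n in range(1, 7)]
example_is_non_crossing = not has_crossing([[1, 4, 5], [2], [3], [6, 8], [7]])
print(counts, example_is_non_crossing)
```

The brute-force enumeration is only practical for small $n$, which is enough to experiment with the formulas of this section.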

Let (A, <p) be a non-commutative probability space, then we shall define a family 
i?( n ) of n- multilinear forms on A, for n > 1, by the following formula 

<p(a 1 ...a n )= ^2 R[^](ai,... ,a n ) 

ireNC(n) 
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Here, for n G NC(n), one has 

R[7r}( ai ,...,a n )= Y[R^\av) 

Veir 

where ay = (a jl7 . . . , a Jfe ) if V = {j±, ... ,jk} is a class of the partition n, with 
ji < J2 < • • • < 3k- In particular R[l n ] = R^ n ' if l n is the partition with only one 
class. For example, one has, for all a E A, 

(p(a) = R(a) 

(we forget the superscript (n) when it is clear which n is considered), for n = 2 

(p(aia 2 ) = R(ai, a 2 ) + R(ai)R(a 2 ) 

thus 

R(ai, a 2 ) = <f(aia 2 ) - <p{a\)<p{a 2 ) 
is the covariance of a± and a 2 . Finally, 

(p(aia 2 a 3 ) =R(ai, a 2 , a 3 ) + R(ai)R(a 2 , a 3 ) + R{a 2 )R(ai, a 3 ) 
+ R(a 3 )R(a 1 , a 2 ) + i?(ai) J R(a 2 ) J R(a 3 ) 

which yields 

R(ai, a 2 , a 3 ) =</?(aia 2 a 3 ) - ^(ai)y?(a 2 a 3 ) - ip(a 2 )ip(aia 3 ) 
- (/?(a 3 )^(aia 2 ) + 2<^(ai)^(a 2 )^(a 3 ) 

In general, for each $n$, one has

$$\varphi(a_1 \cdots a_n) = R^{(n)}(a_1, \ldots, a_n) + \text{terms involving } R^{(k)} \text{ for } k < n$$

so that the $R^{(n)}$ are well defined and can be computed by induction on $n$.
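
The induction can be carried out explicitly. Below is a minimal Python sketch for the case of a single element $a$, where $R^{(n)}(a, \ldots, a)$ depends only on the moments $m_k = \varphi(a^k)$. The helper names are ours, $NC(n)$ is listed by brute force, and the input is a list `moments` with `moments[k-1]` equal to $m_k$.

```python
from itertools import combinations, permutations
from math import prod


def set_partitions(elements):
    """Yield every partition of `elements`; classes are kept in increasing order."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partial in set_partitions(rest):
        for idx in range(len(partial)):
            yield partial[:idx] + [[first] + partial[idx]] + partial[idx + 1:]
        yield [[first]] + partial


def is_noncrossing(partition):
    """True if no i < j < k < l has i, k in one class and j, l in another."""
    return not any(
        i < j < k < l
        for V, W in permutations(partition, 2)
        for i, k in combinations(V, 2)
        for j, l in combinations(W, 2)
    )


def noncrossing_partitions(n):
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]


def free_cumulants(moments):
    """Return [R_1, ..., R_N] from [m_1, ..., m_N] by induction on n."""
    R = []
    for n in range(1, len(moments) + 1):
        # Every partition other than 1_n only involves R_k with k < n.
        lower = sum(
            prod(R[len(V) - 1] for V in p)
            for p in noncrossing_partitions(n)
            if len(p) > 1
        )
        R.append(moments[n - 1] - lower)
    return R


# Moments 0, 1, 0, 2, 0, 5: the even moments are Catalan numbers.
print(free_cumulants([0, 1, 0, 2, 0, 5]))
```

For this input only the second cumulant is non-zero, so the call prints `[0, 1, 0, 0, 0, 0]`.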

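The non-crossing cumulants can be expressed explicitly in terms of the moments
by the following formula

$$R^{(n)}(a_1, \ldots, a_n) = \sum_{\pi \in NC(n)} \mathrm{Moeb}(\pi)\, \varphi[\pi](a_1, \ldots, a_n).$$

Here $\varphi[\pi](a_1, \ldots, a_n) = \prod_{V \in \pi} \varphi(a_{j_1} \cdots a_{j_k})$ where $V = \{j_1, \ldots, j_k\}$ are the classes
of $\pi$, and Moeb is the Möbius function on $NC(n)$, given by

$$\mathrm{Moeb}(\pi) = \prod_{V \in K(\pi)} (-1)^{|V|-1} c_{|V|-1}$$

where $K(\pi)$ is the Kreweras complement of $\pi$, that is, the largest partition $\sigma$ of $\{1', \ldots, n'\}$ such that $\pi \cup \sigma$ is a non-crossing partition of the ordered set $1 < 1' < 2 < 2' < \cdots < n < n'$, and where $c_n = \frac{(2n)!}{n!\,(n+1)!}$ is the $n$th Catalan number. For instance, for $n = 3$ and $\pi = \{\{1\}, \{2\}, \{3\}\}$ one has $K(\pi) = 1_3$ and $\mathrm{Moeb}(\pi) = c_2 = 2$, which is the coefficient of $\varphi(a_1)\varphi(a_2)\varphi(a_3)$ in the formula for $R(a_1, a_2, a_3)$.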
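
As a sanity check of the inversion formula, the Möbius function can also be computed without any closed formula, from its defining recursion on the lattice $NC(n)$: the value at $1_n$ is 1, and for $\pi \ne 1_n$ the values over all $\sigma \ge \pi$ sum to zero. The Python sketch below does this by brute force for small $n$, again for a single element with moments $m_k$. All function names are our own.

```python
from itertools import combinations, permutations
from math import prod


def set_partitions(elements):
    """Yield every partition of `elements`; classes are kept in increasing order."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partial in set_partitions(rest):
        for idx in range(len(partial)):
            yield partial[:idx] + [[first] + partial[idx]] + partial[idx + 1:]
        yield [[first]] + partial


def is_noncrossing(partition):
    return not any(
        i < j < k < l
        for V, W in permutations(partition, 2)
        for i, k in combinations(V, 2)
        for j, l in combinations(W, 2)
    )


def noncrossing_partitions(n):
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]


def key(p):
    """Hashable, order-independent form of a partition."""
    return frozenset(frozenset(V) for V in p)


def is_finer(p, q):
    """True if every class of p is contained in some class of q."""
    return all(any(set(V) <= set(W) for W in q) for V in p)


def moebius_to_top(n):
    """Map each pi in NC(n) to Moeb(pi), using the defining recursion."""
    parts = sorted(noncrossing_partitions(n), key=len)  # coarser partitions first
    mu = {}
    for p in parts:
        # Strictly coarser partitions have fewer classes, so they are already done.
        coarser = [q for q in parts if len(q) < len(p) and is_finer(p, q)]
        mu[key(p)] = 1 if len(p) == 1 else -sum(mu[key(q)] for q in coarser)
    return mu


def cumulant_via_moebius(moments, n):
    """R^(n)(a, ..., a) as the sum over NC(n) of Moeb(pi) * phi[pi]."""
    mu = moebius_to_top(n)
    return sum(
        mu[key(p)] * prod(moments[len(V) - 1] for V in p)
        for p in noncrossing_partitions(n)
    )


print(moebius_to_top(3)[key([[1], [2], [3]])])   # coefficient of phi(a)^3 for n = 3
print(cumulant_via_moebius([0, 1, 0, 2], 4))     # fourth cumulant for moments 0, 1, 0, 2
```

The first value is 2, matching the coefficient in the expression of $R(a_1, a_2, a_3)$, and the second is 0.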

The connection between non-crossing cumulants and freeness is the following
result of section 4 of [Sp1].

Proposition 2. Let $(B_i)_{i \in I}$ be free subalgebras of $A$, and $a_1, \ldots, a_n \in A$ be such
that $a_j$ belongs to some $B_{i_j}$ for each $j \in \{1, 2, \ldots, n\}$. Then $R(a_1, \ldots, a_n) = 0$ if
there exist some $j$ and $k$ with $i_j \ne i_k$.

Proposition 2 is the key to the computation of the coefficients $c(\pi, \Pi)$ of Proposition 1.
Indeed, let $a_1, \ldots, a_n$ be an arbitrary sequence in $A$, such that each $a_j$
belongs to one of the algebras $B_i$, then we have

$$\varphi(a_1 \cdots a_n) = \sum_{\pi \in NC(n)} R[\pi](a_1, \ldots, a_n)$$

and in this sum the terms corresponding to partitions $\pi$ having a class containing
two elements $j$, $k$ such that $a_j$ and $a_k$ belong to distinct algebras give a zero contribution.
Thus we have to sum over partitions in which all $j$ belonging to a certain
class are such that $a_j$ belongs to the same algebra, and the value of $R[\pi](a_1, \ldots, a_n)$
can be expanded in terms of the restriction of $\varphi$ to each of the subalgebras.

The multilinear forms $R^{(n)}$ allow us to recover the free cumulants, indeed one
has

$$\int_{\mathbb{R}} x^n \mu(dx) = \varphi(X^n) = \sum_{\pi \in NC(n)} R[\pi](X, \ldots, X).$$

Proposition 3. Let $X \in A$ be self-adjoint and have distribution $\mu(dx)$, then the
free cumulants of the measure $\mu$ are given by the formula $C_n(\mu) = R^{(n)}(X, \ldots, X)$,
for $n = 1, 2, \ldots$.

Using Propositions 2 and 3, we can now give a proof of Theorem 5. Let $X$ and $Y$
be free random variables with respective distributions $\mu$ and $\nu$, then the cumulants
of $\mu \boxplus \nu$ are given, according to Proposition 3, by

$$C_n(\mu \boxplus \nu) = R^{(n)}(X + Y, \ldots, X + Y).$$

Since $R^{(n)}$ is an $n$-linear form, we can expand $R^{(n)}(X + Y, \ldots, X + Y)$ into a sum of
terms $R^{(n)}(Z_1, Z_2, \ldots, Z_n)$, where each $Z_i$ is either $X$ or $Y$. Applying Proposition
2, we see that all these terms are zero, except $R^{(n)}(X, \ldots, X)$ and $R^{(n)}(Y, \ldots, Y)$,
thus we have

$$R^{(n)}(X + Y, \ldots, X + Y) = R^{(n)}(X, \ldots, X) + R^{(n)}(Y, \ldots, Y)$$

and Theorem 5 follows from Proposition 3 again. The proofs of Propositions 2 and
3 can be found in [Sp1] or [Sp2].
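
The additivity of free cumulants gives a practical way to compute the moments of $\mu \boxplus \nu$ from those of $\mu$ and $\nu$: pass to free cumulants, add them, and go back to moments. The Python sketch below does this by brute force over $NC(n)$ for small $n$. The function names are ours, and measures are represented only through a finite list of moments $[m_1, \ldots, m_N]$.

```python
from itertools import combinations, permutations
from math import prod


def set_partitions(elements):
    """Yield every partition of `elements`; classes are kept in increasing order."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partial in set_partitions(rest):
        for idx in range(len(partial)):
            yield partial[:idx] + [[first] + partial[idx]] + partial[idx + 1:]
        yield [[first]] + partial


def is_noncrossing(partition):
    return not any(
        i < j < k < l
        for V, W in permutations(partition, 2)
        for i, k in combinations(V, 2)
        for j, l in combinations(W, 2)
    )


def noncrossing_partitions(n):
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]


def moments_from_cumulants(R):
    """m_n = sum over NC(n) of the product of R_{|V|} over the classes V."""
    return [
        sum(prod(R[len(V) - 1] for V in p) for p in noncrossing_partitions(n))
        for n in range(1, len(R) + 1)
    ]


def free_cumulants(moments):
    """Invert the moment-cumulant formula by induction on n."""
    R = []
    for n in range(1, len(moments) + 1):
        lower = sum(
            prod(R[len(V) - 1] for V in p)
            for p in noncrossing_partitions(n)
            if len(p) > 1
        )
        R.append(moments[n - 1] - lower)
    return R


def free_convolution_moments(moments_mu, moments_nu):
    """Moments of the free additive convolution, via additivity of cumulants."""
    summed = [a + b for a, b in zip(free_cumulants(moments_mu), free_cumulants(moments_nu))]
    return moments_from_cumulants(summed)


# Symmetric Bernoulli distribution on {-1, +1}: moments 0, 1, 0, 1, 0, 1.
bernoulli = [0, 1, 0, 1, 0, 1]
print(free_convolution_moments(bernoulli, bernoulli))
```

For two symmetric Bernoulli distributions the result is `[0, 2, 0, 6, 0, 20]`: the even moments are the central binomial coefficients, which are the moments of the arcsine distribution on $[-2, 2]$.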

6. Some further topics 

6.1 Multiplicative free convolution. Given two unitary elements $U$, $V$, which
are free in some non-commutative probability space $(A, \varphi)$, we can form their product,
which is again a unitary element. The distributions of $U$ and $V$ are this time
probability measures, say $\mu$ and $\nu$, on the set $\mathbb{T}$ of complex numbers of modulus one,
and the distribution of $UV$, which depends only on $\mu$ and $\nu$ and is denoted by $\mu \boxtimes \nu$, can be computed by
the so-called $\Sigma$-transform. Let us introduce the $\psi$-function of a probability measure
$\mu$ on $\mathbb{T}$, by

$$\psi_\mu(z) = \int_{\mathbb{T}} \frac{z\xi}{1 - z\xi}\, d\mu(\xi).$$

This is a convergent power series in $D = \{z \in \mathbb{C} \mid |z| < 1\}$, the open unit disk of $\mathbb{C}$,
such that $\psi_\mu(0) = 0$. Let $\mathcal{M}_*$ be the set of probability measures on $\mathbb{T}$ such that
$\int_{\mathbb{T}} \xi\, d\mu(\xi) \ne 0$. If $\mu \in \mathcal{M}_*$, the function $\frac{\psi_\mu}{1 + \psi_\mu}$ has a right inverse, called $\chi_\mu$, defined
in a neighbourhood of 0, such that $\chi_\mu(0) = 0$, and we let $\Sigma_\mu(z) = \frac{1}{z}\chi_\mu(z)$ be the
$\Sigma$-transform of $\mu$. Then, for any measures $\mu, \nu \in \mathcal{M}_*$, one has $\mu \boxtimes \nu \in \mathcal{M}_*$ and

$$\Sigma_{\mu \boxtimes \nu}(z) = \Sigma_\mu(z)\, \Sigma_\nu(z)$$

in some neighbourhood of zero where these three functions are defined.

This formula was first found by Voiculescu in [V2]. His proof was quite complicated,
and a simpler one has been given by U. Haagerup. A proof using non-crossing
partitions, due to Nica and Speicher, is in [NS1].

There is an analogue, for free multiplicative convolution on $\mathbb{T}$, of the Lévy-Khinchine
formula. This states that a probability measure on $\mathbb{T}$ is infinitely divisible,
for the free multiplicative convolution, if and only if its $\Sigma$-transform can be
written as $\Sigma_\mu(z) = \exp(u(z))$ where $u$ is an analytic function on $D$, taking values
with nonnegative real part. Such a function has a representation of the form

$$u(z) = i\alpha + \int_{\mathbb{T}} \frac{1 + \zeta z}{1 - \zeta z}\, d\nu(\zeta)$$

for some finite positive measure $\nu$ on $\mathbb{T}$, and real number $\alpha$.

Finally one can also define multiplicative free convolution for measures on $\mathbb{R}_+$.
We shall refer to [BV] for these topics.

We shall here make a remark on some features which distinguish free probability
from classical probability. Let $\mu$ be a probability measure on $\mathbb{R}$, and let $Q$ be (a
suitable branch of) the logarithm of its Fourier transform. Then the set of positive
real numbers $t$ such that $tQ$ is again the logarithm of the Fourier transform of some
probability measure is a closed additive subsemigroup of $\mathbb{R}_+$, containing the positive
integers. It is equal to $\mathbb{R}_+$ exactly when $\mu$ is infinitely divisible, but it can also be
reduced to the set of positive integers (e.g. if $\mu$ is a Bernoulli distribution). In free
probability, the analogue of the function $Q$ is the $R$-transform. Again, the set of $t$
such that $tR_\mu$ is the $R$-transform of some probability measure is a closed additive
subsemigroup of $\mathbb{R}_+$, but this time this subsemigroup always contains the interval
$[1, +\infty[$. In fact the measure with $R$-transform $tR_\mu$ has a nice description in terms
of a free compression. Namely, let $X$ be self-adjoint, with distribution $\mu$ in $(A, \varphi)$,
and let $\pi$ be a self-adjoint projection in $A$, free with $X$, and such that $\varphi(\pi) = \frac{1}{t}$,
with $t \in [1, \infty[$. The distribution of $\pi$ is the Bernoulli distribution $(1 - \frac{1}{t})\delta_0 + \frac{1}{t}\delta_1$.
The set $\pi A \pi$ of elements of the form $\pi Z \pi$, for $Z \in A$, is an algebra, with $\pi$ as a
unit, and $(\pi A \pi, t\varphi)$ is a non-commutative probability space. The element
$t\pi X \pi$ in $(\pi A \pi, t\varphi)$ has a distribution whose $R$-transform is given by
$tR_\mu$ (try to show this using what you know about $R$-transforms and non-crossing
partitions!). This shows that $tR_\mu$ is the $R$-transform of some probability measure,
for all $t \ge 1$. As the example of the Bernoulli distribution shows, there are probability
measures $\mu$ for which $tR_\mu$ is the $R$-transform of some probability measure only for
$t \in \{0\} \cup [1, \infty[$.
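
The Bernoulli example can be explored numerically. The Python sketch below takes the symmetric Bernoulli distribution on $\{-1, +1\}$, multiplies its free cumulants by $t$, and computes the resulting second and fourth "moments" $m_2$ and $m_4$. Any probability measure satisfies $m_4 \ge m_2^2$ by the Cauchy-Schwarz inequality, so a negative value of $m_4 - m_2^2$ rules out the existence of a measure. This is only a necessary condition, so the sketch illustrates the obstruction for $0 < t < 1$ and does not prove existence for $t \ge 1$. The function names are ours.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import prod


def set_partitions(elements):
    """Yield every partition of `elements`; classes are kept in increasing order."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partial in set_partitions(rest):
        for idx in range(len(partial)):
            yield partial[:idx] + [[first] + partial[idx]] + partial[idx + 1:]
        yield [[first]] + partial


def is_noncrossing(partition):
    return not any(
        i < j < k < l
        for V, W in permutations(partition, 2)
        for i, k in combinations(V, 2)
        for j, l in combinations(W, 2)
    )


def noncrossing_partitions(n):
    return [p for p in set_partitions(list(range(1, n + 1))) if is_noncrossing(p)]


def moments_from_cumulants(R):
    return [
        sum(prod(R[len(V) - 1] for V in p) for p in noncrossing_partitions(n))
        for n in range(1, len(R) + 1)
    ]


def free_cumulants(moments):
    R = []
    for n in range(1, len(moments) + 1):
        lower = sum(
            prod(R[len(V) - 1] for V in p)
            for p in noncrossing_partitions(n)
            if len(p) > 1
        )
        R.append(moments[n - 1] - lower)
    return R


def fourth_moment_gap(t):
    """m_4 - m_2^2 for the moment sequence whose free cumulants are t times
    those of the symmetric Bernoulli distribution (moments 0, 1, 0, 1)."""
    scaled = [t * r for r in free_cumulants([0, 1, 0, 1])]
    m = moments_from_cumulants(scaled)
    return m[3] - m[1] ** 2


for t in (Fraction(1, 2), Fraction(1), Fraction(2)):
    print(t, fourth_moment_gap(t))
```

Here $m_2 = t$ and $m_4 = 2t^2 - t$, so the gap equals $t(t - 1)$: it is $-1/4$ for $t = 1/2$, which excludes a probability measure, and it is nonnegative for $t = 1$ and $t = 2$.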

In order to close this section, let us note that the problem of finding the distribution
of the commutator of two free self-adjoint random variables has been solved
recently by Nica and Speicher [NS2].
6.2 More about random matrices. As we have seen in section 2, free probability
provides us with a good understanding of the way that spectra of large matrices
behave under addition. So far we have not said anything about eigenvectors. It
turns out that free probability again has something to tell us about this. We shall
again consider two large hermitian matrices $A$ and $B$, whose spectra are known.
Thinking of $A + B$ as a perturbation of the matrix $A$, we would like to know how
the eigenvectors of $A + B$ are related to those of $A$. It is clear that if $B$ is small
compared to $A$, then the eigenvectors of $A + B$ should be close to those of $A$.
Let us denote by $\gamma_1, \ldots, \gamma_N$ the normalized eigenvectors of $A$, associated with the
eigenvalues $\lambda_1, \ldots, \lambda_N$, whereas we denote by $\xi_1, \ldots, \xi_N$ the eigenvectors of $A + B$,
associated with the eigenvalues $\nu_1, \ldots, \nu_N$. The passage from the old basis to the
new basis is given by the transition matrix $(\langle \xi_k, \gamma_l \rangle)$. In fact, since the eigenvectors
are only defined up to some complex number of modulus one, we shall only consider
the numbers $|\langle \xi_k, \gamma_l \rangle|^2$, which form a bistochastic matrix. These numbers are quite
hard to tackle, and they may not have a definite asymptotic behaviour, as $N \to \infty$,
so we shall evaluate them against some test functions. Let $f$ and $g$ be smooth
functions on $\mathbb{R}$, then we shall look at the asymptotic behaviour of the expression

$$\frac{1}{N} \sum_{1 \le k, l \le N} |\langle \xi_k, \gamma_l \rangle|^2\, g(\nu_k)\, f(\lambda_l).$$

We can rewrite this as

$$\mathrm{tr}(g(A + B) f(A))$$

with tr the normalized trace. Then, using our result on the asymptotic behaviour of large matrices, it is easy to see
that, if the empirical distributions of the eigenvalues of $A$ and $B$ converge, as
$N \to \infty$, towards $\mu$ and $\nu$, and we choose matrices at random as in Theorem 3,
then the expression

$$\mathrm{tr}(g(A'_N + B'_N) f(A'_N))$$

will converge, in probability, as $N \to \infty$, towards

$$\varphi(g(X + Y) f(X))$$

where $X$ and $Y$ are free self-adjoint elements, with respective distributions $\mu$ and
$\nu$. In principle, we can compute the value of such an expression, for example if $f$
and $g$ are polynomials. In fact it is easy, by a positivity argument, to see that this
value is given by $\int f(x) g(u)\, \rho(dx, du)$ where $\rho$ is some probability measure on $\mathbb{R}^2$.
It turns out that this probability measure can be disintegrated along the values of
$x$ as $\rho(dx, du) = k(x, du)\mu(dx)$, where $k(x, du)$ is a Markov transition kernel, which
could be thought of as the limit of the bistochastic matrices $|\langle \xi_k, \gamma_l \rangle|^2$, and this
Markov kernel can be explicitly computed, namely it is characterized by

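$$\int_{\mathbb{R}} (\zeta - u)^{-1} k(x, du) = (F(\zeta) - x)^{-1} \quad \text{for all } \zeta \in \mathbb{C} \setminus \mathbb{R}$$

for some function $F$ on $\mathbb{C} \setminus \mathbb{R}$ such that $F(\bar{\zeta}) = \overline{F(\zeta)}$, $F(\mathbb{C}^+) \subset \mathbb{C}^+$, $\mathrm{Im}(F(\zeta)) \ge \mathrm{Im}(\zeta)$
for $\zeta \in \mathbb{C}^+$, and $\frac{F(iy)}{iy} \to 1$ as $y \to +\infty$, $y \in \mathbb{R}$, and

$$G_\mu(F(\zeta)) = G_{\mu \boxplus \nu}(\zeta)$$

for all $\zeta \in \mathbb{C} \setminus \mathbb{R}$. The map $F$ is uniquely determined by these properties.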

This result appears in [Bi], where it is applied to the theory of processes with 
free increments. 






PHILIPPE BIANE 



References 

[Bi] P. Biane, Processes with free increments, Math. Z. 227 (1998), 143-174.

[BP] H. Bercovici and V. Pata, The law of large numbers for free identically distributed random variables, Ann. Probab. 24 (1996), 453-465.

[BPB] H. Bercovici and V. Pata, Domains of attraction of free stable distributions; with an appendix by P. Biane on the density of free stable distributions, Ann. Math. (to appear).

[BV] H. Bercovici and D. Voiculescu, Free convolution of measures with unbounded support, Indiana University Mathematics Journal 42 (1993), 733-773.

[F] W. Fulton, Eigenvalues of sums of Hermitian matrices (after A. Klyachko), Séminaire Bourbaki, exposé 845, Juin 1998.

[M] B. Maurey, Construction de suites symétriques, C. R. Acad. Sc. Ser. A 288 (1979), 679-681.

[NS1] A. Nica and R. Speicher, On the multiplication of free n-tuples of non-commutative random variables, Amer. J. Math. 118 (1996), 799-837.

[NS2] A. Nica and R. Speicher, Commutators of free random variables, Duke Math. J. 92 (1998), 553-592.

[Sp1] R. Speicher, Multiplicative functions on the lattice of non-crossing partitions and free convolution, Math. Annalen 298 (1994), 141-159.

[Sp2] R. Speicher, Combinatorial theory of the free product with amalgamation and operator-valued free probability theory, Mem. Amer. Math. Soc., vol. 627, 1998.

[Vo1] D. V. Voiculescu, Addition of non-commuting random variables, J. Operator Theory 18 (1987), 223-235.

[Vo2] D. V. Voiculescu, Limit laws for random matrices and free products, Invent. Math. 104 (1991), 201-220.

[VDN] D. V. Voiculescu, K. Dykema and A. Nica, Free random variables, CRM Monograph Series No. 1, Amer. Math. Soc., Providence, RI, 1992.

[W] W. Woess, Random walks on infinite graphs and groups - a survey on selected topics, Bull. London Math. Soc. 26 (1994), 1-60.

[X] F. Xu, A random matrix model from two-dimensional Yang-Mills theory, Comm. Math. Phys. 190 (1997), 287-307.

CNRS, DMI, Ecole Normale Superieure 
45, rue d'Ulm, 75005 Paris FRANCE 



